179 research outputs found

    Multi-Objective Reinforcement Learning Based on Decomposition: A Taxonomy and Framework

    Multi-objective reinforcement learning (MORL) extends traditional RL by seeking policies making different compromises among conflicting objectives. The recent surge of interest in MORL has led to diverse studies and solving methods, often drawing from existing knowledge in multi-objective optimization based on decomposition (MOO/D). Yet, a clear categorization based on both RL and MOO/D is lacking in the existing literature. Consequently, MORL researchers face difficulties when trying to classify contributions within a broader context due to the absence of a standardized taxonomy. To tackle such an issue, this paper introduces multi-objective reinforcement learning based on decomposition (MORL/D), a novel methodology bridging the literature of RL and MOO. A comprehensive taxonomy for MORL/D is presented, providing a structured foundation for categorizing existing and potential MORL works. The introduced taxonomy is then used to scrutinize MORL research, enhancing clarity and conciseness through well-defined categorization. Moreover, a flexible framework derived from the taxonomy is introduced. This framework accommodates diverse instantiations using tools from both RL and MOO/D. Its versatility is demonstrated by implementing it in different configurations and assessing it on contrasting benchmark problems. Results indicate MORL/D instantiations achieve comparable performance to current state-of-the-art approaches on the studied problems. By presenting the taxonomy and framework, this paper offers a comprehensive perspective and a unified vocabulary for MORL. This not only facilitates the identification of algorithmic contributions but also lays the groundwork for novel research avenues in MORL.

    Comment: Accepted at JAI
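
The abstract above does not spell out how decomposition works, so the following is only a minimal sketch of the general idea borrowed from MOO/D: a multi-objective problem is split into scalar subproblems, one per weight vector. The function names (`uniform_weights`, `weighted_sum`, `tchebycheff`) and the two-objective setting are assumptions made for illustration and are not taken from the MORL/D paper.

```python
from typing import List, Sequence


def uniform_weights(num_vectors: int) -> List[List[float]]:
    """Evenly spaced weight vectors for two objectives; each sums to 1."""
    if num_vectors < 2:
        raise ValueError("need at least two weight vectors")
    step = 1.0 / (num_vectors - 1)
    return [[i * step, 1.0 - i * step] for i in range(num_vectors)]


def weighted_sum(reward: Sequence[float], weights: Sequence[float]) -> float:
    """Linear scalarisation of a vector reward (higher is better)."""
    return sum(w * r for w, r in zip(weights, reward))


def tchebycheff(
    reward: Sequence[float],
    weights: Sequence[float],
    ideal: Sequence[float],
) -> float:
    """Weighted Tchebycheff scalarisation, negated so that higher is better.

    `ideal` is a reference point holding the best value seen per objective.
    """
    return -max(w * abs(z - r) for w, r, z in zip(weights, reward, ideal))


if __name__ == "__main__":
    vector_reward = [2.0, 4.0]  # hypothetical two-objective reward
    for w in uniform_weights(3):
        # Each weight vector defines one scalar subproblem.
        print(w, weighted_sum(vector_reward, w))
```

Each weight vector would typically be assigned to one policy (or one conditioning input of a shared policy), so that the set of subproblem solutions approximates the Pareto front.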

    Design and analysis of an E-Puck2 robot plug-in for the ARGoS simulator

    Peer reviewed.

    In this article we present a new plug-in for the ARGoS swarm robotic simulator to implement the E-Puck2 robot model, including its graphical representation, sensors and actuators. We have based our development on the former E-Puck robot model (version 1) by upgrading the existing sensors (proximity, light, ground, camera, and battery) and adding new ones (time of flight and simulated encoders) implemented from scratch. We have adapted the values produced by the proximity, light and ground sensors, as well as those of the E-Puck2's onboard camera according to its resolution, and proposed four new discharge models for the battery. We have evaluated this new plug-in in terms of accuracy and efficiency through comparisons with real robots and extensive simulations. In all our experiments the proposed plug-in has worked well, showing high levels of accuracy. The observed increase in execution time when using the studied sensors varies according to the number of robots and types of sensors included in the simulation, ranging from a negligible impact to 53% longer simulations in the most demanding cases.

    R-AGR-3933 - C20/IS/14762457/ADARS (01/05/2021 - 30/04/2024) - DANOY Grégoir

    Hyperparameter Optimization for Multi-Objective Reinforcement Learning

    Reinforcement learning (RL) has emerged as a powerful approach for tackling complex problems. The recent introduction of multi-objective reinforcement learning (MORL) has further expanded the scope of RL by enabling agents to make trade-offs among multiple objectives. This advancement has not only broadened the range of problems that can be tackled but also created numerous opportunities for exploration and advancement. Yet, the effectiveness of RL agents heavily relies on appropriately setting their hyperparameters. In practice, this task often proves to be challenging, leading to unsuccessful deployments of these techniques in various instances. Hence, prior research has explored hyperparameter optimization in RL to address this concern. This paper presents an initial investigation into the challenge of hyperparameter optimization specifically for MORL. We formalize the problem, highlight its distinctive challenges, and propose a systematic methodology to address it. The proposed methodology is applied to a well-known environment using a state-of-the-art MORL algorithm, and preliminary results are reported. Our findings indicate that the proposed methodology can effectively provide hyperparameter configurations that significantly enhance the performance of MORL agents. Furthermore, this study identifies various future research opportunities to further advance the field of hyperparameter optimization for MORL.

    Comment: Presented at the MODeM workshop https://modem2023.vub.ac.be/

    Optimising Autonomous Robot Swarm Parameters for Stable Formation Design

    Autonomous robot swarm systems can overcome many inherent limitations of single-robot systems, such as limited scalability and reliability. As a consequence, they have found their way into numerous applications, including space and aerospace applications such as swarm-based asteroid observation and counter-drone systems. However, achieving stable formations around a point of interest using different numbers of robots and diverse initial conditions can be challenging. In this article we propose a novel method for the self-organisation of autonomous robot swarms that relies solely on their relative position (angle and distance). This work focuses on an evolutionary optimisation approach to calculate the parameters of the swarm, e.g. inter-robot distance, to achieve a reliable formation under different initial conditions. Experiments are conducted using realistic simulations and considering four case studies. The results observed after testing the optimal configurations on 72 unseen scenarios per case study showed the high robustness of our proposal, since the desired formation was always achieved. The ability to self-organise around a point of interest while maintaining a predefined fixed distance was also validated using real robots.

    Optimising Robot Swarm Formations by Using Surrogate Models and Simulations

    Peer reviewed.

    Optimising a swarm of many robots can be computationally demanding, especially when accurate simulations are required to evaluate the proposed robot configurations. Consequently, the size of the instances and swarms must be limited, reducing the number of problems that can be addressed. In this article, we study the viability of using surrogate models based on Gaussian processes and artificial neural networks as predictors of the robots’ behaviour when arranged in formations surrounding a central point of interest. We have trained the surrogate models and tested them in terms of accuracy and execution time on five different case studies comprising three, five, ten, fifteen, and thirty robots. Then, the best performing predictors combined with ARGoS simulations have been used to obtain optimal configurations for the robot swarm by using our proposed hybrid evolutionary algorithm, based on a genetic algorithm and a local search. Finally, the best swarm configurations obtained have been tested on a number of unseen scenarios comprising different initial robot positions to evaluate the robustness and stability of the achieved robot formations. The best performing predictors exhibited speed increases of up to 3604 with respect to the ARGoS simulations. The optimisation algorithm converged in 91% of runs and stable robot formations were achieved in 79% of the unseen testing scenarios.

    R-AGR-3933 - C20/IS/14762457/ADARS (01/05/2021 - 30/04/2024) - DANOY Grégoir

    An Evolutionary Algorithm to Optimise a Distributed UAV Swarm Formation System

    In this article, we present a distributed robot 3D formation system optimally parameterised by a hybrid evolutionary algorithm (EA) in order to improve its efficiency and robustness. To achieve that, we first describe the novel distributed formation algorithm 3 (DFA3), the proposed EA, and the two crossover operators to be tested. The EA hyperparameterisation is performed using the irace package. The three case studies, featuring three, five, and ten unmanned aerial vehicles (UAVs), are evaluated through realistic simulations in ARGoS, with ten scenarios evaluated in parallel to improve the robustness of the configurations calculated. The optimisation results, reported with statistical significance, and the validation performed on 270 unseen scenarios show that the use of a metaheuristic is imperative for such a complex problem, despite some overfitting observed under certain circumstances. All in all, the UAV swarm self-organised to achieve stable formations in 95% of the scenarios studied, with a plus/minus ten percent tolerance.

    Internet of Unmanned Aerial Vehicles—A Multilayer Low-Altitude Airspace Model for Distributed UAV Traffic Management

    The rapid adoption of the Internet of Things (IoT) has encouraged the integration of new connected devices such as Unmanned Aerial Vehicles (UAVs) into the ubiquitous network. UAVs promise a pragmatic solution to the limitations of existing terrestrial IoT infrastructure and bring new means of delivering IoT services through a wide range of applications. Owing to their potential, UAVs are expected to soon dominate the low-altitude airspace over populated cities. This introduces new research challenges such as the safe management of UAV operations under high traffic demands. This paper proposes a novel way of structuring the uncontrolled, low-altitude airspace, with the aim of addressing the complex problem of UAV traffic management at an abstract level. The work, hence, introduces a model of the airspace as a weighted multilayer network of nodes and airways and presents a set of experimental simulation results using three UAV traffic management heuristics.
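
To make the idea of a weighted multilayer network of nodes and airways more concrete, here is a minimal sketch, not the paper's model. It assumes nodes are encoded as `(layer, waypoint)` pairs, that airways are undirected edges with a scalar cost, and that a route is chosen with Dijkstra's algorithm. The abstract does not name its three heuristics, so none of them is reproduced here, and all names and costs below are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple

Node = Tuple[int, str]  # (layer index, waypoint name): hypothetical encoding
Graph = Dict[Node, List[Tuple[Node, float]]]


def add_airway(graph: Graph, a: Node, b: Node, cost: float) -> None:
    """Add an undirected weighted airway between two nodes."""
    graph.setdefault(a, []).append((b, cost))
    graph.setdefault(b, []).append((a, cost))


def cheapest_route(graph: Graph, source: Node, target: Node) -> Tuple[float, List[Node]]:
    """Dijkstra's algorithm over the layered graph."""
    dist: Dict[Node, float] = {source: 0.0}
    prev: Dict[Node, Node] = {}
    queue: List[Tuple[float, Node]] = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if node == target:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, cost in graph.get(node, []):
            nd = d + cost
            if nd < dist.get(neighbour, float("inf")):
                dist[neighbour] = nd
                prev[neighbour] = node
                heapq.heappush(queue, (nd, neighbour))
    if target not in dist:
        raise ValueError("target unreachable")
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[target], path


def demo_airspace() -> Graph:
    """Two layers, two waypoints; the lower-layer airway is congested."""
    graph: Graph = {}
    add_airway(graph, (0, "A"), (0, "B"), 10.0)  # congested low-layer airway
    add_airway(graph, (0, "A"), (1, "A"), 1.0)   # climb to the upper layer
    add_airway(graph, (1, "A"), (1, "B"), 3.0)   # free upper-layer airway
    add_airway(graph, (1, "B"), (0, "B"), 1.0)   # descend
    return graph


if __name__ == "__main__":
    print(cheapest_route(demo_airspace(), (0, "A"), (0, "B")))
```

In this toy instance the route through the upper layer is cheaper than the direct low-layer airway, which is the kind of trade-off that edge weights in a layered model can express.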

    A Parallel Cooperative Coevolutionary SMPSO Algorithm for Multi-objective Optimization

    We present a parallel multi-objective cooperative coevolutionary variant of the Speed-constrained Multi-objective Particle Swarm Optimization (SMPSO) algorithm. The algorithm, called CCSMPSO, is the first multi-objective cooperative coevolutionary algorithm based on PSO in the literature. SMPSO adopts a strategy for limiting the velocity of the particles that prevents them from having erratic movements. This characteristic provides the algorithm with a high degree of reliability. In order to demonstrate the effectiveness of CCSMPSO, we compare our work with the original SMPSO and three different state-of-the-art multi-objective CC metaheuristics, namely CCNSGA-II, CCSPEA2 and CCMOCell, along with their original sequential counterparts. Our experiments indicate that our proposed solution, CCSMPSO, offers significant computational speedups, a higher convergence speed and better or comparable results in terms of solution quality, when evaluated against three other CC algorithms and four state-of-the-art optimizers (namely SMPSO, NSGA-II, SPEA2, and MOCell), respectively. We then provide a scalability analysis, which consists of two studies. First, we analyze how the algorithms scale when varying the problem size, i.e., the number of variables. Second, we analyze their scalability in terms of parallelization, i.e., the impact of using more computational cores on the quality of solutions and on the execution time of the algorithms. Three different criteria are used for making the comparisons, namely the quality of the resulting approximation sets, average computational time and the convergence speed to the Pareto front.
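
The velocity-limiting strategy mentioned above can be illustrated with a small sketch. It assumes the commonly described SMPSO rule in which each velocity component is clamped to plus or minus half the range of the corresponding decision variable. This is an assumption about the original SMPSO, not something stated in the abstract, and the function names are hypothetical. The cooperative coevolutionary and parallel parts of CCSMPSO are not shown.

```python
from typing import List, Sequence


def constrain_velocity(
    velocity: Sequence[float],
    lower: Sequence[float],
    upper: Sequence[float],
) -> List[float]:
    """Clamp each velocity component to [-delta, delta].

    delta is half the width of that variable's bounds (assumed SMPSO rule).
    """
    limited = []
    for v, lo, hi in zip(velocity, lower, upper):
        delta = (hi - lo) / 2.0
        limited.append(max(-delta, min(delta, v)))
    return limited


def move_particle(
    position: Sequence[float],
    velocity: Sequence[float],
    lower: Sequence[float],
    upper: Sequence[float],
) -> List[float]:
    """Apply the constrained velocity and keep the particle inside its bounds."""
    limited = constrain_velocity(velocity, lower, upper)
    return [
        min(hi, max(lo, x + v))
        for x, v, lo, hi in zip(position, limited, lower, upper)
    ]


if __name__ == "__main__":
    lower, upper = [0.0, 0.0, -1.0], [4.0, 4.0, 1.0]
    print(constrain_velocity([10.0, -0.5, -7.0], lower, upper))
```

Bounding the step size this way keeps a particle from jumping across the whole search range in a single update, which is one way to read the "erratic movements" the abstract refers to.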